ABSTRACT



This paper describes the different methods used to evaluate 
individually configurable multimedia materials developed for the Horizon 
project, a program designed to increase employment opportunities for students 
with disabilities or learning difficulties. The project established a working 
cafe/restaurant in East London staffed by the students. Part of the project 
involved the creation of multimedia units, linked directly to Level 1 
National Vocational Qualifications (NVQ) in Catering and Business Studies, to 
support the training of the Cafe Horizon workers. Cafe workers attended a 
college one day each week where they used multimedia materials to work toward 
their NVQ in Catering. Learners worked individually or in small groups with 
specialist support workers who assisted users and participated in evaluation 
of the software. This paper discusses the five methods used for formative and
summative evaluation of the project and presents some benefits and limitations
of each method. Evaluation methods included: expert evaluation,
analysis of logged data, questionnaire methods, interview methods, and video 
methods. The difficulties in evaluating the program are also reviewed. 

The results showed that a combination approach to evaluation was better than any
single method at identifying problems, particularly when methods were 
combined with expert evaluation. (Contains 21 references.) (CR) 
Abstract



Horizon is a European funded project whose aim is to increase employment 
opportunities for students with disabilities or learning difficulties. A working 
cafe/restaurant (Cafe Horizon) has been established in East London, staffed
by students involved in this project. Similar initiatives are taking place at 
other locations in this country and in Ireland and Spain. Part of the project 
involved the creation of multimedia units, linked directly to Level 1 National 
Vocational Qualifications (NVQ) in Catering and Business Studies, to support 
the training of these workers. In this paper we describe the design of a 
practical approach to evaluating individually configurable multimedia materials 
developed for the Horizon project. These materials were created according to 
constructivist theories of learning using a simple student model to hold user 
information. In this way, individual configuration of learning was achieved. 
Evaluation objectives were established early in the project cycle. These were 
closely related to the working environment and to individual learning 
objectives set for each learner. Five methods used for formative and 
summative evaluations in the project are described and some benefits and 
limitations of each method are presented. 

Introduction 

Horizon is a European project, the aim of which is to increase employment 
opportunities for students with disabilities and learning difficulties. In Ireland, 
Horizon workers run a public house, in Spain a restaurant and in the UK, a 
small cafe, Cafe Horizon. Multimedia learning materials have been developed 
to provide a supported learning environment that forms the basis of training 
for this work. Cafe Horizon workers attend Waltham Forest College one day 
each week where they work towards their Foundation Level National 
Vocational Qualification (NVQ) in Catering, using the multimedia materials in 
college learning centres. Materials are also being used in Ireland and Spain in 
similar ways. 

Horizon workers constitute a wide range of learners who require varying 
additional support. Some have severe physical disability, yet in all other ways 
cannot be distinguished from other learners. Other Cafe Horizon workers 
have emotional and cognitive problems that impose severe restrictions on 
learning. Some have a combination of physical and mental disability in 
addition to problems of language. One challenge of this work was to produce 
learning material that could be configured for the learner in respect of their 
particular individual learning problems. 

Individual configuration of learning - a simple student model 

Many authors hold that constructivist theories of learning should underpin the 
development of learning applications, for example, Park and Hannafin (1993) 
and Atkins (1993) describe such an approach. To assist in this process, 
information on the learner may be held in the form of a student model which is 
a representation of the learner’s knowledge, prior skills and characteristics 
used to configure a computer learning application (Muldner et al, 1997).

A simple student model is used in the Horizon project, where a configuration 
file is used to individualize applications for learners. Computer based 
diagnostic tests are used in conjunction with specialist tutors to configure this 
file. The components of the student model are briefly described below. 

Components of the student model 

• Domain level: Previous subject ability is measured by a simple pre-test. 

• Language support: Material is presented at appropriate language level, 
based on the results of language pre-tests. 

• Learning style: Material is presented according to learners’ preferred 
learning style.

• Task Level: Task levels are configured for individuals as described by 
Barker and colleagues (1997a) based on Bloom’s taxonomy (Bloom 1956). 

• Questions: In-course questions are used to challenge learners and 
provide feedback. The student model selects the level and type of 
question. 

• Interface configuration: Information is held about special requirements of 
learners, for example the need for a touch screen, font size, sound setup 
and screen presentation. 
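
As a rough illustration only, a configuration file holding the components listed above might be represented as follows. The field names, value types and the JSON format are assumptions made for this sketch; the actual file layout used in the Horizon project is not described here.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class StudentModel:
    """Hypothetical per-learner configuration; all field names are illustrative."""
    learner_id: str
    domain_level: int        # result of the subject pre-test
    language_level: int      # result of the language pre-tests
    learning_style: str      # learner's preferred learning style
    task_level: int          # task level, after Bloom's taxonomy
    question_level: int      # level of in-course questions
    touch_screen: bool = False
    font_size: int = 14
    sound_enabled: bool = True


def save_model(model: StudentModel) -> str:
    """Serialise the model to a JSON string (the file format is an assumption)."""
    return json.dumps(asdict(model), indent=2)


def load_model(text: str) -> StudentModel:
    """Rebuild a StudentModel from its JSON representation."""
    return StudentModel(**json.loads(text))


# Example learner who needs a touch screen and a large font.
example = StudentModel("learner-01", domain_level=1, language_level=2,
                       learning_style="visual", task_level=1,
                       question_level=1, touch_screen=True, font_size=24)
restored = load_model(save_model(example))
```

In a sketch of this kind, the diagnostic tests and the specialist tutor would both write to the same record, and the application would read it at login to set up the interface and content level.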

The methods used in the design and development of Horizon materials have 
been described by Barker and colleagues (Barker et al, 1997b). Our 
approach to evaluating these materials is shown below. 

Evaluation of Horizon material. 

In this section we describe the objectives of our evaluation scheme and the 
various methods we used to achieve them. 

Evaluation is important in all stages of the project life cycle according to 
Rushby (1997). Many authors have distinguished between summative and 
formative evaluation (Squires, 1996; Rushby, 1997). Formative evaluations 
are carried out throughout the development of the material and should involve 
designers, developers and a few learners according to Chanier (1996). 
Formative evaluation is used to guide the design and initial implementation of 
the package. Summative evaluation relates to the evaluation of the final 
application. Both formative and summative evaluations were used in the 
development of the Horizon materials. 

Evaluation objectives were specified early in the project. In the next section 
we outline evaluation objectives in three related areas, pedagogy, usability 
and user satisfaction. 



Evaluation objectives 

1. Assessment of Learning and Pedagogy 

Yildiz and Atkins (1993) provide guidelines for the design of evaluation, based
on a survey of evaluations since the 1970s. Park and Hannafin (1993) also
provide a set of empirically derived guidelines for conducting evaluations, 
based on user interface and pedagogical principles. Learning is 
individualized in our project by the use of individual targets and objectives for 
each learner. These were used to establish a set of objectives upon which 
pedagogical evaluation was based. The following considerations were used 
to establish the evaluation objectives: 

Were specific learning objectives supported by the application? 

Were targets achieved or not? 

Did material support constructivist learning? 

Were applications interactive and task based? 

Could learners contribute to their own learning? 

Did the computer system integrate well with other systems in place?

Were tutors involved in the course? 

Was content accurate and appropriate? 

Was course material, assessment, etc. at the appropriate level? 

Was the use of the media appropriate or not? 

2. Interface design / usability testing 

The principles of interface design have been described by several authors, for 
example Dix et al (1994) and Reeves and Harmon (1994). In our project the
software had to be simple to use and had to perform efficiently and robustly, 
leading to the following usability considerations: 

Could users start, login and logout of packages easily? 

Were users with disability supported? 

Were applications robust? 

Was unnecessary cognitive overhead avoided? 

Were instructions clear and easy to follow? 

Could users navigate, locate and orientate easily? 

Did users always know what to do next? 

Were users able to perform required tasks easily? 

Did learners have sufficient computer experience? 

3. Interest and User Satisfaction

Our objectives in this area were simple:

Did learners like using the application? 
What features of the course were judged to be good or bad?

Were screens clear and attractive? 

Did applications have the right ‘look’ and ‘feel’? 

Were media of high quality and did they add to the course? 

Were materials interesting or boring, interactive or passive? 

An evaluation scheme based on the above was produced which had the 
following format: 

• The objective to be tested 

• How it is to be tested - evaluation methods and detailed procedure. 

• When it should be tested (formative, summative) and by whom 

• Assessment criteria for the objective 
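
One possible way to hold entries in this four-part format is sketched below. The objectives are taken from the lists above, but the class name, the field names, and the example methods, evaluators and criteria are hypothetical and are not the scheme actually used in the project.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SchemeEntry:
    """One row of an evaluation scheme; field names are illustrative."""
    objective: str        # the objective to be tested
    methods: List[str]    # how it is to be tested
    stage: str            # "formative" or "summative"
    evaluator: str        # by whom
    criteria: str         # assessment criteria for the objective


def entries_for_stage(scheme: List[SchemeEntry], stage: str) -> List[SchemeEntry]:
    """Return the entries scheduled for the given evaluation stage."""
    return [entry for entry in scheme if entry.stage == stage]


# Hypothetical entries; the methods and criteria are invented for this example.
scheme = [
    SchemeEntry("Could users start, login and logout of packages easily?",
                ["scripted video", "logged data"], "formative",
                "development team",
                "hypothetical: task completed without tutor help"),
    SchemeEntry("Did learners like using the application?",
                ["questionnaire", "interview"], "summative",
                "tutors",
                "hypothetical: majority of ratings above the scale midpoint"),
]

formative = entries_for_stage(scheme, "formative")
```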

In the following section we describe the various methods we used to assess 
the material. 

Evaluation methods 

The following methods of evaluation were used in the project. 

• Expert evaluation 

• Analysis of logged data 

• Questionnaire methods 

• Interview methods 

• Video methods 

For each method we will describe how it was used, together with some of its 
benefits and limitations in the project. 

Expert evaluation 

The use of expert evaluators is described by Catenazzi and colleagues 
(1997), who took the role of less experienced users to identify usability 
problems. Expert evaluation has been described by Perisco (1996), who 
describes it as a form of subjective evaluation performed on prototypes. In 
the Horizon project, prototypes were distributed to subject, educational and 
learning difficulties specialists. Guidelines were developed at transnational
meetings for use in these evaluations. 

We found that experts were very efficient in unearthing problems, but that 
language translation took more time than was anticipated. The need for 
regular updating meetings between evaluators and developers increased 
costs. The use of learning difficulty specialists saved much trial and error, as 
they were able to identify and correct potential problems early in the 
development stage. 

Analysis of logged data 
The use of automated data collection methods has been described by
Henderson and colleagues (1995), who state that the technique is 
unobtrusive, inexpensive, accurate and reliable. However, Laws and Barber
(1989) found it difficult to gain high level insight from low level data capture
methods. We collected full tracking information including login
and logout times, times spent on each screen and navigation information from 
which it was hoped that high level user intentions might be inferred. 

One important use of the method was the isolation of specific problem areas 
encountered by some learners. It was possible, for example, to identify 
learners who were moving through the course slowly from their log files. 
Potential reasons might include navigational, interface, domain, specific 
disability, language or other problems. It was then possible to use another 
method to identify the specific reason, for example interview or video. 
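
The idea of flagging slow-moving learners from log files can be sketched as follows. The record layout, the sample data and the threshold rule (mean time per screen more than twice the group median) are all assumptions made for illustration; they are not the format or criterion used in the project.

```python
from collections import defaultdict
from statistics import median

# Hypothetical log records: (learner_id, screen_id, seconds_on_screen)
LOG = [
    ("A", "s1", 40), ("A", "s2", 55),
    ("B", "s1", 45), ("B", "s2", 50),
    ("C", "s1", 200), ("C", "s2", 260),
]


def mean_time_per_screen(records):
    """Return each learner's mean number of seconds spent per screen."""
    totals = defaultdict(lambda: [0, 0])   # learner -> [total seconds, screens]
    for learner, _screen, seconds in records:
        totals[learner][0] += seconds
        totals[learner][1] += 1
    return {learner: total / count for learner, (total, count) in totals.items()}


def slow_learners(records, factor=2.0):
    """Flag learners whose mean screen time exceeds factor x the group median."""
    means = mean_time_per_screen(records)
    typical = median(means.values())
    return sorted(l for l, m in means.items() if m > factor * typical)


flagged = slow_learners(LOG)
```

A flag of this kind only shows that a learner is slow, not why, which is where a follow-up interview or video session would come in.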

Logged data provided large amounts of information that was cheap to acquire 
yet expensive to process. Data files created were used for formative and 
summative evaluation and were useful for tutors to track learners’ progress in 
a course. The method translated well for evaluating other language versions. 



Questionnaire methods 

The design and use of questionnaires has been described in many places in 
the literature. For example, Karat (1988) states that the method is 
inexpensive, fast and easy to process. However, Henderson and colleagues 
(1995) caution over the use of questionnaire data taken in isolation. Three 
types of questionnaire method were developed for summative and formative 
evaluation of the Horizon material. 

• Paper based questionnaires were developed from the evaluation 
guidelines. A five-point scale was used to record information about user 
satisfaction, experiences and difficulties. Questionnaires were short and 
simple and were distributed widely for comment. 

• Multimedia versions of questionnaires with sound support and simple 
graphical interface were developed. 

• Group questionnaire methods were used to support learners with difficulty 
using questionnaires. Small groups of learners discussed and completed 
questionnaires together with their tutor at the end of each session. 

Although questionnaires were related directly to our evaluation objectives, we 
found that the general nature of our questions made it difficult to identify 
specific problems. Sometimes attributes rated on questionnaires did not exist 
in the application, for example sound support was rated highly when not used. 
Sometimes learners clearly did not understand questions. The multimedia 
questionnaire was considered to be a good feature by experts and simplified 
marking. The group questionnaire method was introduced to enable learners 
with severe cognitive problems to contribute. It was more costly than a simple
questionnaire but provided information unobtainable elsewhere. Users were
able to share experiences about the application with tutors and often this led 
to new information about the application. Videos of group sessions were 
found to be useful. 

Questionnaires were helpful in finding out general attitudes but the individual 
nature of the Horizon material, the specific problems of learners and the small 
sample size, made individually completed questionnaires less useful than 
other methods. The group questionnaire method has potential for the future. 

Interview methods 

Interviews were used in the formative evaluation in the Horizon project. 
Cordingley (1989) describes the use of a semi structured interview method 
that we adopted for this work. Structured components of the interview 
consisted of about thirty or so questions and were scripted based on the 
evaluation objectives. Interviewers were fully briefed on the evaluation 
objectives and were present while learners were using the Horizon material. 
Subjects could respond in any way to questions and were encouraged to 
explore issues. Interviews were recorded unobtrusively on video for later 
viewing. 

The open ended nature of the interview method used was important in 
locating problems that were missed by other methods. Interviews provided 
anecdotal information that was useful for developers. Video recording of 
sessions was useful as it recorded facial expressions and body gestures. 

The video recording did not appear to affect subjects, who were keen to 
participate. Interview methods are expensive and require effort to set up and 
process data. They did, however, provide large amounts of useful information.

Video methods 

Laws and Barber (1989) describe the use of video as a data capture method 
which they suggest is useful for collecting anecdotal evidence in addition to 
more structured data. The ability to view video repeatedly makes it more 
useful than simple observation. Video was used in two ways in our work. 
General information was provided by loosely structured video sessions with 
up to six learners at each session. Such sessions were scripted to record 
general features of applications. For example team working and interactions 
between learners and tutors were recorded. 

More formally structured scripted video sessions were also performed. For 
example learners were asked to login, logout, locate specific areas, find 
information and perform tasks. Videos of these actions were viewed by the 
development team later. 

The use of scripted video sessions was found to be efficient in the testing of 
problem areas identified by expert evaluation or another method. It was
possible to test critical areas of courses and such information led to rapid 
improvements in interface design. Group sessions were less useful for 
locating specific problems in the applications, but provided useful summative 
information on how courses were being used and the learning environment. 
Video methods were found to be expensive, yet were able to provide detailed 
and useful information and videos could be viewed repeatedly. 

Discussion 

The Horizon learning materials are currently being used to support training 
and work experience of six groups of learners in three countries. It is hoped to 
extend the project to include other European countries in the future. Several 
features of the project added to the difficulty of evaluation. These were: the 
requirement for rapid prototyping and material development, the learning 
difficulties dimension, the constructivist approach and need for individual 
configuration of learning, the transnational component and small scale of the 
project. Several authors have recommended that evaluation be situated in 
context (Squires and McDougall, 1996; Squires, 1997; Yildiz and Atkins, 
1993). In our project, evaluation took place with real learners and experts 
using the material in real training and vocational settings. 

A combination approach to evaluation was employed to overcome many of 
the above difficulties. All methods used in the project provided useful 
formative information for developers. We confirmed the findings of Henderson 
and colleagues (1993) that methods used in combination were better than 
any single method at identifying problems. This was especially true when 
methods were combined with expert evaluation. 

The evaluation did not attempt to measure cost effectiveness or to compare 
multimedia with other methods of delivering learning. Reeves (1991) points 
out problems involved in this type of comparison. Summative evaluation
therefore centred on how useful the applications were in supporting learning.

Difficulties that arose due to the dispersed nature of the project were helped 
by regular transnational meetings. These were found to be very important in 
the evaluation process. Not only did such meetings help in sharing objectives
and ideas, but they were also important in maintaining momentum and providing
deadlines for evaluators. 